Tabu search based algorithms for DNA sequencing
Abstract
DNA sequencing is a basic technique in computational molecular biology, leading to the recognition of the genetic information of organisms. The information is encoded as a sequence of nucleotides (the basic building blocks of DNA) composing a double helix, which in humans is about 3 billion nucleotides long. There are four types of nucleotides: A, C, G, and T (abbreviations of the names of their nitrogenous bases: adenine, cytosine, guanine, and thymine). The order of nucleotides in DNA strands determines the processes occurring in organisms, and thus their structures and functions. Sequencing is the process of reading the nucleotide order of an unknown DNA fragment, usually up to 1000 nucleotides long (the length depends on the kind of biological experiment). Such sequences are then combined into larger contigs in the assembly process and analyzed to extract genetic information.
The approach discussed here is DNA sequencing by hybridization (SBH). It consists of two stages: first, biological data are produced in a hybridization experiment; next, they become the input to the computational phase, which ends with the reconstructed sequence. The output of the hybridization reaction can be viewed as a set (called the spectrum) of words (oligonucleotides) over the alphabet {A, C, G, T}, which are short subsequences of the studied DNA fragment. The aim of the DNA sequencing problem is to reconstruct the original DNA sequence of a known length on the basis of these overlapping oligonucleotides [20]. In the standard approach to SBH, the oligonucleotide library used in the hybridization experiment contains all possible oligonucleotides of a given constant length. The spectrum produced by the experiment is a subset of the library, i.e. the set of words of equal length composing the original sequence [22].
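As a small illustration of the standard-SBH spectrum described above, the following Python sketch builds the ideal (error-free) spectrum of a known sequence. The function name `ideal_spectrum`, the example sequence, and the oligonucleotide length 3 are assumptions made for illustration only; they do not come from the paper, and a real hybridization experiment would additionally introduce negative and positive errors.

```python
from typing import Set


def ideal_spectrum(sequence: str, length: int) -> Set[str]:
    """Return the set of all substrings of `sequence` with the given length.

    Hypothetical helper: it models an error-free hybridization experiment
    with a library of all oligonucleotides of one constant length.
    """
    if length <= 0 or length > len(sequence):
        raise ValueError("length must be between 1 and len(sequence)")
    # A set is used because the spectrum records only which words occur,
    # not how many times, so repeated words collapse into one element.
    return {sequence[i:i + length] for i in range(len(sequence) - length + 1)}


if __name__ == "__main__":
    spectrum = ideal_spectrum("ACTCTGG", 3)
    print(sorted(spectrum))  # ['ACT', 'CTC', 'CTG', 'TCT', 'TGG']
```

Because the spectrum is a set, a word that occurs several times in the sequence appears only once, which is one reason reconstruction can be ambiguous even before experimental errors are considered.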
In contrast, in the isothermic approach to SBH, the library contains all oligonucleotides that share a constant melting temperature of their duplexes but differ in length. This is meant to provide better chemical conditions, which results in a lower number of experimental errors. To guarantee that the spectrum covers the whole studied DNA fragment, the experimental phase must be carried out with two isothermic libraries differing by two degrees [1]. There are also other proposals modifying the standard SBH procedure, for example multistage SBH [17] or SBH with universal nitrogenous bases [21]. For both standard and isothermic SBH, the computational complexity of several variants of the combinatorial problem is already known, and the corresponding variants of the two approaches belong to the same complexity classes. The variants with no errors in the spectrum are polynomially solvable [19, 8], while the variants assuming the presence of errors in the data (negative errors, positive errors, or both) are all strongly NP-hard [7, 8]. Because errors are present
Similar resources
A heuristic approach for multi-stage sequence-dependent group scheduling problems
We present several heuristic algorithms based on tabu search for solving the multi-stage sequence-dependent group scheduling (SDGS) problem by considering minimization of makespan as the criterion. As the problem is recognized to be strongly NP-hard, several meta (tabu) search-based solution algorithms are developed to efficiently solve industry-size problem instances. Also, two different initi...
A heuristic managing errors for DNA sequencing
MOTIVATION: A new heuristic algorithm for solving the DNA sequencing by hybridization problem with positive and negative errors. RESULTS: A heuristic algorithm providing better solutions than algorithms known from the literature based on the tabu search method.
Tabu-KM: A Hybrid Clustering Algorithm Based on Tabu Search Approach
The clustering problem under the minimum sum of squares criterion is a non-convex, non-linear program with many locally optimal values, so its solution often falls into these traps and therefore cannot converge to the globally optimal solution. In this paper, an efficient hybrid optimization algorithm, called Tabu-KM, is developed for solving this problem. It gathers the ...
Comparison of particle swarm optimization and tabu search algorithms for portfolio selection problem
Using metaheuristic models and evolutionary algorithms to solve the portfolio problem has been considered in recent years. In this study, we optimized two-sided risk measures using particle swarm optimization and tabu search algorithms. A standard exact penalty function transforms the considered portfolio selection problem into an equivalent unconstrained minimization problem. And in final...
GENETIC AND TABU SEARCH ALGORITHMS FOR THE SINGLE MACHINE SCHEDULING PROBLEM WITH SEQUENCE-DEPENDENT SET-UP TIMES AND DETERIORATING JOBS
This paper introduces the effects of job deterioration and sequence-dependent set-up time in a single machine scheduling problem. The considered optimization criterion is the minimization of the makespan (Cmax). For this purpose, after formulating the mathematical model, genetic and tabu search algorithms were developed for the problem. Since population diversity is a very important issue in ...
Comparison of Simulated Annealing, Genetic, and Tabu Search Algorithms for Fracture Network Modeling
The mathematical modeling of fracture networks is critical for the exploration and development of natural resources. Fractures can help the production of petroleum, water, and geothermal energy. They also greatly influence the drainage and production of methane gas from coal beds. Orientation and spatial distribution of fractures in rocks are important factors in controlling fluid flow. The obj...